About objective dialogue evaluation methods
Abstract
'Objective' evaluation means evaluation using numbers (metrics) that can be calculated without human intervention. An obvious advantage is that human effort is reduced. It is also claimed to be less biased by human opinion, but this text shows that applying these methods involves many choices, which introduces bias at a different level. Two objective evaluation methods claimed to be state-of-the-art, both developed at AT&T, are described and commented upon. Both are black-box methods, meaning that they can be used without reference to the implementation of the dialogue system.
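To make the idea of a black-box objective metric concrete, here is a minimal sketch (not the AT&T methods themselves): a score computed purely from observable quantities in a dialogue log, trading off task success against dialogue costs. The field names and weights are hypothetical choices, illustrating exactly the kind of choices the abstract says introduce bias at a different level.

```python
# Illustrative sketch of a black-box objective dialogue metric.
# All field names and weight values are hypothetical assumptions.

def objective_score(dialogue, w_turns=0.5, w_errors=1.0, w_success=5.0):
    """Score a dialogue from observable quantities only.

    dialogue: dict with 'turns' (int), 'asr_errors' (int),
              'task_success' (bool) -- all obtainable without a human judge.
    """
    # Cost terms: every turn and every recognition error counts against the score.
    cost = w_turns * dialogue["turns"] + w_errors * dialogue["asr_errors"]
    # Reward term: a bonus if the user's task was completed.
    reward = w_success * (1.0 if dialogue["task_success"] else 0.0)
    return reward - cost

log = {"turns": 6, "asr_errors": 1, "task_success": True}
print(objective_score(log))  # 5.0 - (3.0 + 1.0) = 1.0
```

Note that the scoring never inspects the system's internals, only its observable behaviour, which is what makes it a black-box method; the choice of features and weights, however, is entirely up to the evaluator.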
Similar papers
Spoken dialogue translation systems evaluation: results, new trends, problems and proposals
It is important to evaluate Spoken Dialogue Translation Systems, but as we show by analyzing evaluation methods in the Verbmobil, C-STAR II, and the Nespole! projects, the current state of the art is not fully satisfactory. Subjective methods are too costly, and objective methods, although cheaper, don’t give good indications about usability. We propose some ideas to improve that situation.
Preparing for Evaluation of a Flight Spoken Dialogue System
Evaluation is needed to test how well the dialogue system works, and it helps the developer find problems and make the system more satisfactory. We have developed a flight spoken dialogue system and decided to carry out a thorough evaluation. Five dialogue scenarios with brief task descriptions are carefully designed. A questionnaire for user satisfaction is ready, which will be ...
Automatic Learning and Evaluation of User-Centered Objective Functions for Dialogue System Optimisation
The ultimate goal when building dialogue systems is to satisfy the needs of real users, but quality assurance for dialogue strategies is a non-trivial problem. The applied evaluation metrics and resulting design principles are often obscure, emerge by trial-and-error, and are highly context dependent. This paper introduces data-driven methods for obtaining reliable objective functions for syste...
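The data-driven idea described in that abstract can be sketched as fitting an objective function that predicts human satisfaction ratings from automatically measurable dialogue features, here via ordinary least squares. The feature set and the tiny data set are invented for illustration, not taken from the cited work.

```python
import numpy as np

# Hypothetical sketch: learn weights for an objective function from
# (features, human rating) pairs. Feature columns and ratings are invented.

# rows: [num_turns, recognition_errors, task_completed]
X = np.array([
    [5, 0, 1],
    [12, 3, 1],
    [8, 1, 0],
    [4, 0, 1],
], dtype=float)
y = np.array([4.5, 2.0, 1.5, 5.0])  # human satisfaction ratings (scale 1-5)

# Add a bias column and solve the least-squares problem min ||Xw - y||^2.
Xb = np.hstack([X, np.ones((len(X), 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

def predicted_satisfaction(features):
    """Objective function: predicted satisfaction for a new dialogue."""
    return float(np.dot(np.append(np.asarray(features, dtype=float), 1.0), w))

print(round(predicted_satisfaction([6, 1, 1]), 2))
```

Once fitted, the learned function can score new dialogues without further human ratings, which is the sense in which such objective functions reduce the cost of quality assurance.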
An Evaluation Understudy for Dialogue Coherence Models
Evaluating a dialogue system is seen as a major challenge within the dialogue research community. Due to the very nature of the task, most of the evaluation methods need a substantial amount of human involvement. Following the tradition in machine translation, summarization and discourse coherence modeling, we introduce the idea of an evaluation understudy for dialogue coherence models. Follow...